Abstract

We present several sparsification lower and upper bounds for classic problems in graph theory and
logic. For the problems 4-Coloring, (Directed) Hamiltonian Cycle, and (Connected)
Dominating Set, we prove that there is no polynomial-time algorithm that reduces any
$n$-vertex input to an equivalent instance, of an arbitrary problem, with bitsize
$O(n^{2-\varepsilon})$ for $\varepsilon > 0$, unless NP $\subseteq$ coNP/poly and the polynomial-time hierarchy
collapses. These results imply that existing linear-vertex kernels for $k$-Nonblocker and
$k$-Max Leaf Spanning Tree (the parametric duals of (Connected) Dominating Set) cannot be
improved to have $O(k^{2-\varepsilon})$ edges, unless NP $\subseteq$ coNP/poly. We also present a positive
result and exhibit a non-trivial sparsification algorithm for $d$-Not-All-Equal-SAT. We give an
algorithm that reduces an $n$-variable input with clauses of size at most $d$ to an equivalent
input with $O(n^{d-1})$ clauses, for any fixed $d$. Our algorithm is based on a linear-algebraic
proof of Lovász that bounds the number of hyperedges in critically 3-chromatic $d$-uniform
$n$-vertex hypergraphs by $O(n^{d-1})$. We show that our kernel is tight under the assumption
that NP $\not\subseteq$ coNP/poly.

1998 ACM Subject Classification F.2.2 Nonnumerical Algorithms and Problems, G.2.2 Graph 
Theory 
1 Introduction

Background. Sparsification refers to the method of reducing an object such as a graph or 
CNF-formula to an equivalent object that is less dense, that is, an object in which the ratio of 
edges to vertices (or clauses to variables) is smaller. The notion is fruitful in theoretical [?]
and practical (cf. [?]) settings when working with (hyper)graphs and formulas. The theory
of kernelization, originating from the field of parameterized complexity theory, can be used 
to analyze the limits of polynomial-time sparsification. Using tools developed in the last 
five years, it has become possible to address questions such as: “Is there a polynomial-time 
algorithm that reduces an n-vertex instance of my favorite graph problem to an equivalent 
instance with a subquadratic number of edges?” 

The impetus for this line of analysis was given by an influential paper by Dell and van 
Melkebeek [5] (conference version in 2010). One of their main results states that if there is 
an $\varepsilon > 0$ and a polynomial-time algorithm that reduces any $n$-vertex instance of Vertex
Cover to an equivalent instance, of an arbitrary problem, that can be encoded in $O(n^{2-\varepsilon})$
bits, then NP $\subseteq$ coNP/poly and the polynomial-time hierarchy collapses. Since any nontrivial
input $(G, k)$ of Vertex Cover has $k \leq n = |V(G)|$, their result implies that the number
of edges in the $2k$-vertex kernel for $k$-Vertex Cover [?] cannot be improved to $O(k^{2-\varepsilon})$
unless NP $\subseteq$ coNP/poly.

Using related techniques, Dell and van Melkebeek also proved important lower bounds
for $d$-CNF-SAT problems: testing the satisfiability of a propositional formula in CNF,
where each clause has at most $d$ literals. They proved that for every fixed integer $d \geq 3$, the
existence of a polynomial-time algorithm that reduces any $n$-variable instance of $d$-CNF-SAT
to an equivalent instance, of an arbitrary problem, with $O(n^{d-\varepsilon})$ bits, for some
$\varepsilon > 0$, implies NP $\subseteq$ coNP/poly. Their lower bound is tight: there are $O(n^d)$ possible
clauses of size $d$ over $n$ variables, allowing an instance to be represented by a vector of
$O(n^d)$ bits that specifies for each clause whether or not it is present.
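The bit-vector representation mentioned above can be sketched as follows. This is only an illustration under simplifying assumptions: clauses are taken to have exactly $d$ distinct variables, and the names `all_clauses`, `encode` and `decode` are hypothetical helpers, not part of the cited work.

```python
from itertools import combinations, product

def all_clauses(n, d):
    """List every clause on exactly d distinct variables out of n, in a fixed order.
    A clause is a tuple of (variable, is_positive) pairs sorted by variable."""
    clauses = []
    for variables in combinations(range(n), d):
        for signs in product((True, False), repeat=d):
            clauses.append(tuple(zip(variables, signs)))
    return clauses

def encode(formula, n, d):
    """Return one bit per possible clause: 1 if the clause occurs in the formula."""
    index = {clause: i for i, clause in enumerate(all_clauses(n, d))}
    bits = [0] * len(index)
    for clause in formula:
        bits[index[tuple(sorted(clause))]] = 1
    return bits

def decode(bits, n, d):
    """Recover the clause set from the bit vector."""
    return [clause for clause, bit in zip(all_clauses(n, d), bits) if bit]
```

The vector has $\binom{n}{d} \cdot 2^d$ entries, which is $O(n^d)$ for fixed $d$, regardless of how many clauses the formula contains.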


Our results. We continue this line of investigation and analyze sparsification for several 
classic problems in graph theory and logic. We obtain several sparsification lower bounds 
that imply that the quadratic number of edges in existing linear-vertex kernels is likely to 
be unavoidable. When it comes to problems from logic, we give what is, to the best of our
knowledge, the first example of a problem that does admit nontrivial sparsification:
$d$-Not-All-Equal-SAT. We also provide a matching lower bound.

The first problem we consider is 4-Coloring, which asks whether the input graph has a
proper vertex coloring with 4 colors. Using several new gadgets, we give a cross-composition [3]
to show that the problem has no compression of size $O(n^{2-\varepsilon})$ unless NP $\subseteq$ coNP/poly. To
obtain the lower bound, we give a polynomial-time construction that embeds the logical
OR of a series of $t$ size-$n$ inputs of an NP-hard problem into a graph $G'$ with
$O(\sqrt{t} \cdot n^{O(1)})$ vertices, such that $G'$ has a proper 4-coloring if and only if there is a
yes-instance among the inputs. The main structure of the reduction follows the approach of
Dell and Marx [7]: we create a table with two rows and $O(\sqrt{t})$ columns and $n^{O(1)}$ vertices
in each cell. For each way of picking one cell from each row, we aim to embed one instance
into the edge set between the corresponding groups of vertices. When the NP-hard starting
problem is chosen such that the $t$ inputs each decompose into two induced subgraphs with a
simple structure, one can create the vertex groups and their connections such that for each
pair of cells $(i, j)$, the subgraph they induce represents the $(i \cdot \sqrt{t} + j)$-th input. If there
is a yes-instance among the inputs, this leads to a pair of cells that can be properly colored
in a structured way. The challenging part of the reduction is to ensure that the edges in the
graph corresponding to no-inputs do not give conflicts when extending this partial coloring
to the entire graph.

The next problem we attack is Hamiltonian Cycle. We rule out compressions of
size $O(n^{2-\varepsilon})$ for the directed and undirected variant of the problem, assuming
NP $\not\subseteq$ coNP/poly. The construction is inspired by kernelization lower bounds for Directed
Hamiltonian Cycle parameterized by the vertex-deletion distance to a directed graph whose
underlying undirected graph is a path [2].

By combining gadgets from kernelization lower bounds for two different parameterizations
of Red Blue Dominating Set, we prove that there is no compression of size $O(n^{2-\varepsilon})$ for
Dominating Set unless NP $\subseteq$ coNP/poly. The same construction rules out subquadratic
compressions for Connected Dominating Set. These lower bounds have implications
for the kernelization complexity of the parametric duals Nonblocker and Max Leaf
Spanning Tree of (Connected) Dominating Set. For both Nonblocker and Max
Leaf there are kernels with $O(k)$ vertices [?, ?] that have $\Omega(k^2)$ edges. Our lower bounds
imply that the number of edges in these kernels cannot be improved to $O(k^{2-\varepsilon})$, unless
NP $\subseteq$ coNP/poly.

The final family of problems we consider is $d$-Not-All-Equal-SAT for fixed $d \geq 4$. The
input consists of a formula in CNF with at most $d$ literals per clause. The question is
whether there is an assignment to the variables such that each clause contains both a literal
that evaluates to true and one that evaluates to false. There is a simple linear-parameter
transformation from $d$-CNF-SAT to $(d+1)$-NAE-SAT that consists of adding one variable that
occurs as a positive literal in all clauses. By the results of Dell and van Melkebeek discussed
above, this implies that $d$-NAE-SAT does not admit compressions of size $O(n^{d-1-\varepsilon})$ unless
NP $\subseteq$ coNP/poly. We prove the surprising result that this lower bound is tight! A
linear-algebraic result due to Lovász [?], concerning the size of critically 3-chromatic $d$-uniform
hypergraphs, can be used to give a kernel for $d$-NAE-SAT with $O(n^{d-1})$ clauses for every
fixed $d$. The kernel is obtained by computing the basis of an associated matrix and removing
the clauses that can be expressed as a linear combination of the basis clauses.
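The linear-parameter transformation from $d$-CNF-SAT to $(d+1)$-NAE-SAT described above can be illustrated with a small sketch. Literals are encoded as nonzero integers (positive for a variable, negative for its negation); this encoding and all function names are assumptions made for the example, and the brute-force checkers are only meant for tiny instances.

```python
from itertools import product

def cnf_to_nae(clauses, n):
    """Add one fresh variable n+1 as a positive literal to every clause."""
    fresh = n + 1
    return [list(clause) + [fresh] for clause in clauses], n + 1

def lit_value(lit, assignment):
    """Truth value of an integer-encoded literal under a tuple of booleans."""
    value = assignment[abs(lit) - 1]
    return value if lit > 0 else not value

def is_satisfiable(clauses, n):
    """Brute force: some assignment makes at least one literal true per clause."""
    return any(all(any(lit_value(l, a) for l in c) for c in clauses)
               for a in product((False, True), repeat=n))

def is_nae_satisfiable(clauses, n):
    """Brute force: some assignment gives every clause a true and a false literal."""
    return any(all(len({lit_value(l, a) for l in c}) == 2 for c in clauses)
               for a in product((False, True), repeat=n))
```

The equivalence holds because NAE-satisfying assignments are closed under complementation: one may always assume the fresh variable is false, so each clause must contain a true original literal.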

Related work. Dell and Marx introduced the table structure for compression lower bounds [7]
in their study of compression for packing problems. Hermelin and Wu [?] analyzed similar
problems. Other papers about polynomial kernelization and sparsification lower bounds
include [?] and [?].

2 Preliminaries

A parameterized problem $Q$ is a subset of $\Sigma^* \times \mathbb{N}$, where $\Sigma$ is a finite alphabet. Let
$Q, Q' \subseteq \Sigma^* \times \mathbb{N}$ be parameterized problems and let $h: \mathbb{N} \to \mathbb{N}$ be a computable
function. A generalized kernel for $Q$ into $Q'$ of size $h(k)$ is an algorithm that, on input
$(x, k) \in \Sigma^* \times \mathbb{N}$, takes time polynomial in $|x| + k$ and outputs an instance $(x', k')$ such that:

1. $|x'|$ and $k'$ are bounded by $h(k)$, and

2. $(x', k') \in Q'$ if and only if $(x, k) \in Q$.

The algorithm is a kernel for $Q$ if $Q' = Q$. It is a polynomial (generalized) kernel if $h(k)$ is a
polynomial.

Since a polynomial-time reduction to an equivalent sparse instance yields a generalized 
kernel, we will use the concept of generalized kernels in the remainder of this paper to 
prove the non-existence of such sparsification algorithms. We employ the cross-composition 
framework by Bodlaender et al. [3], which builds on earlier work by several authors [?].

► Definition 1 (Polynomial equivalence relation). An equivalence relation $\mathcal{R}$ on $\Sigma^*$ is called a
polynomial equivalence relation if the following conditions hold.

1. There is an algorithm that, given two strings $x, y \in \Sigma^*$, decides whether $x$ and $y$ belong
to the same equivalence class in time polynomial in $|x| + |y|$.

2. For any finite set $S \subseteq \Sigma^*$ the equivalence relation $\mathcal{R}$ partitions the elements of $S$ into a
number of classes that is polynomially bounded in the size of the largest element of $S$.

► Definition 2 (Cross-composition). Let $L \subseteq \Sigma^*$ be a language, let $\mathcal{R}$ be a polynomial
equivalence relation on $\Sigma^*$, let $Q \subseteq \Sigma^* \times \mathbb{N}$ be a parameterized problem, and let
$f: \mathbb{N} \to \mathbb{N}$ be a function. An OR-cross-composition of $L$ into $Q$ (with respect to $\mathcal{R}$) of cost
$f(t)$ is an algorithm that, given $t$ instances $x_1, x_2, \ldots, x_t \in \Sigma^*$ of $L$ belonging to the same
equivalence class of $\mathcal{R}$, takes time polynomial in $\sum_{i=1}^{t} |x_i|$ and outputs an instance
$(y, k) \in \Sigma^* \times \mathbb{N}$ such that:

1. the parameter $k$ is bounded by $O(f(t) \cdot (\max_i |x_i|)^c)$, where $c$ is some constant independent
of $t$, and

2. $(y, k) \in Q$ if and only if there is an $i \in [t]$ such that $x_i \in L$.

Figure 1 Used gadgets with example colorings.

► Theorem 3 ([3]). Let $L \subseteq \Sigma^*$ be a language, let $Q \subseteq \Sigma^* \times \mathbb{N}$ be a parameterized problem, and
let $d, \varepsilon$ be positive reals. If $L$ is NP-hard under Karp reductions, has an OR-cross-composition
into $Q$ with cost $f(t) = t^{1/d + o(1)}$, where $t$ denotes the number of instances, and $Q$ has a
polynomial (generalized) kernelization with size bound $O(k^{d-\varepsilon})$, then NP $\subseteq$ coNP/poly.

For $r \in \mathbb{N}$ we will refer to an OR-cross-composition of cost $f(t) = t^{1/r} \log(t)$ as a
degree-$r$ cross-composition. By Theorem 3, a degree-$r$ cross-composition can be used to rule
out generalized kernels of size $O(k^{r-\varepsilon})$. We frequently use the fact that a polynomial-time
linear-parameter transformation from problem $Q$ to $Q'$ implies that any generalized
kernelization lower bound for $Q$ also holds for $Q'$ (cf. [?]). Let $[r]$ be defined as
$[r] := \{x \in \mathbb{N} \mid 1 \leq x \leq r\}$.

3 4-Coloring

In this section we analyze the 4-Coloring problem, which asks whether it is possible to
assign each vertex of the input graph one out of 4 possible colors, such that there is no
edge whose endpoints share the same color. We show that 4-Coloring does not have a
generalized kernel of size $O(n^{2-\varepsilon})$, by giving a degree-2 cross-composition from a tailor-made
problem that will be introduced below. Before giving the construction, we first present and
analyze some of the gadgets that will be needed.

► Definition 4. A treegadget is the graph obtained from a complete binary tree by replacing
each vertex $v$ by a triangle on vertices $r_v$, $x_v$ and $y_v$. Let $r_v$ be connected to the parent of $v$
and let $x_v$ and $y_v$ be connected to the left and right subtree of $v$. An example of a treegadget
with 8 leaves is shown in Figure 1. If vertex $v$ is the root of the tree, then $r_v$ is named the
root of the treegadget. If $v$ does not have a left subtree, then $x_v$ is a leaf of this gadget;
similarly, if $v$ does not have a right subtree then we refer to $y_v$ as a leaf of the gadget. Let
the height of a treegadget be equal to the height of its corresponding binary tree.

It is easy to see that a treegadget is 3-colorable. The important property of this gadget 
is that if there is a color that does not appear on any leaf in a proper 3-coloring, then this 
must be the color of the root. See Figure 1 for an illustration.

► Lemma 5. Let $T$ be a treegadget with root $r$ and let $c: V(T) \to \{1,2,3\}$ be a proper
3-coloring of $T$. If $k \in \{1,2,3\}$ such that $c(v) \neq k$ for every leaf $v$ of $T$, then $c(r) = k$.

Proof. This will be proven using induction on the structure of a treegadget. For a single
triangle, the result is obvious. Suppose we are given a treegadget of height $h$ and that the
statement holds for all treegadgets of smaller height. Consider the top triangle $r, x, y$ where
$r$ is the root. Then, by the induction hypothesis, the roots of the left and right subtree are
colored using $k$. Hence $x$ and $y$ do not use color $k$. Since $x, y, r$ is a triangle, $r$ has color $k$ in
the 3-coloring. ◄
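Lemma 5 can be checked exhaustively on small treegadgets. The sketch below builds a treegadget following Definition 4 and enumerates all proper 3-colorings by backtracking; the function names and the parameter `levels` (the number of levels of the underlying complete binary tree) are assumptions introduced for this illustration.

```python
from itertools import count

def build_treegadget(levels):
    """Return (num_vertices, edges, root, leaves) of a treegadget whose
    underlying complete binary tree has the given number of levels."""
    ids = count()
    edges, leaves = [], []

    def build(level):
        r, x, y = next(ids), next(ids), next(ids)
        edges.extend([(r, x), (r, y), (x, y)])      # triangle replacing a tree vertex
        for corner in (x, y):
            if level == 1:
                leaves.append(corner)               # no subtree below: corner is a leaf
            else:
                edges.append((corner, build(level - 1)))  # attach root of the subtree
        return r

    root = build(levels)
    return next(ids), edges, root, leaves

def proper_colorings(num_vertices, edges, colors=(1, 2, 3)):
    """Yield every proper coloring as a tuple indexed by vertex."""
    adj = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    coloring = [None] * num_vertices

    def extend(v):
        if v == num_vertices:
            yield tuple(coloring)
            return
        for c in colors:
            if all(coloring[w] != c for w in adj[v]):
                coloring[v] = c
                yield from extend(v + 1)
        coloring[v] = None                          # undo before backtracking

    yield from extend(0)

def lemma5_holds(levels):
    """Check: a color missing from all leaves must be the color of the root."""
    n, edges, root, leaves = build_treegadget(levels)
    for c in proper_colorings(n, edges):
        for k in {1, 2, 3} - {c[v] for v in leaves}:
            if c[root] != k:
                return False
    return True
```

Calling `lemma5_holds(3)` covers the treegadget with 8 leaves; larger gadgets quickly become too big for exhaustive enumeration.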


The following lemma will be used in the correctness proof of the cross-composition to 
argue that the existence of a single yes-input is sufficient for 4-colorability of the entire graph. 

► Lemma 6. Let $T$ be a treegadget with leaves $L \subseteq V(T)$ and root $r$. Any 3-coloring
$c': L \to \{1,2,3\}$ that is proper on $T[L]$ can be extended to a proper 3-coloring $c$ of $T$. If there
is a leaf $v \in L$ such that $c'(v) = i$, then such an extension exists with $c(r) \neq i$.


Proof. We will prove this by induction on the height of the treegadget. For a single triangle,
the result is obvious. Suppose the lemma is true for all treegadgets up to height $h - 1$ and
we are given a treegadget of height $h$ with root triangle $r, x, y$ and with coloring of the leaves
$c'$. Let one of the leaves be colored using $i$. Without loss of generality assume this leaf is in
the left subtree, which is connected to $x$. By the induction hypothesis, we can extend the
coloring restricted to the leaves of the left subtree to a proper 3-coloring of the left subtree
such that $c(r_1) \neq i$, where $r_1$ is the root of the left subtree. We assign color $i$ to $x$. Since $c'$
restricted to the leaves in the right subtree is a proper 3-coloring of the leaves in the right
subtree, by induction we can extend that coloring to a proper 3-coloring of the right subtree.
Suppose the root of this subtree gets color $j \in \{1,2,3\}$. We now color $y$ with a color
$k \in \{1,2,3\} \setminus \{i, j\}$, which must exist.

Finally, choose $c(r) \in \{1,2,3\} \setminus \{i, k\}$. By definition, the vertices $r$, $y$, and $x$ are now assigned
different colors. Both $x$ and $y$ have a different color than the root of their corresponding
subtree, so $c$ is a proper coloring. We obtain that the defined coloring $c$ is a proper
coloring extending $c'$ with $c(r) \neq i$. ◄


► Definition 7. A triangular gadget is a graph on 12 vertices depicted in Figure 1c. Vertices
$u$, $v$, and $w$ are the corners of the gadget; all other vertices are referred to as inner vertices.


It is easy to see that a triangular gadget is always 3-colorable in such a way that every 
corner gets a different color. Moreover, we make the following observation. 

► Observation 8. Let $G$ be a triangular gadget with corners $u$, $v$ and $w$ and let
$c: V(G) \to \{1,2,3\}$ be a proper 3-coloring of $G$. Then $c(v) \neq c(u) \neq c(w) \neq c(v)$. Furthermore, every
partial coloring that assigns distinct colors to the three corners of a triangular gadget can be
extended to a proper 3-coloring of the entire gadget.


Having presented all the gadgets we use in our construction, we now define the source 
problem for the cross-composition. It is a variant of the problem that was used to prove 
kernel lower bounds for Chromatic Number parameterized by vertex cover [3]. 

2-3-Coloring with Triangle Split Decomposition

Input: A graph $G$ with a partition of its vertex set into $X \cup Y$ such that $G[X]$ is an
edgeless graph and $G[Y]$ is a disjoint union of triangles.

Question: Is there a proper 3-coloring $c : V(G) \to \{1,2,3\}$ of $G$, such that $c(x) \in \{1,2\}$
for all $x \in X$? We will refer to such a coloring as a 2-3-coloring of $G$.
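To make the problem definition concrete, the following brute-force sketch decides whether a small instance has a 2-3-coloring. It takes exponential time and is only intended as an executable restatement of the definition; the function name and the input representation (lists of vertex labels plus an edge list) are assumptions of this example.

```python
from itertools import product

def has_23_coloring(X, Y, edges):
    """Return True if the graph has a proper 3-coloring in which
    every vertex of X receives color 1 or 2."""
    vertices = list(X) + list(Y)
    allowed = {v: (1, 2) for v in X}          # restricted palette on the independent set
    allowed.update({v: (1, 2, 3) for v in Y})
    for choice in product(*(allowed[v] for v in vertices)):
        coloring = dict(zip(vertices, choice))
        if all(coloring[u] != coloring[v] for u, v in edges):
            return True
    return False
```

For example, a vertex of $X$ adjacent to all three vertices of a triangle in $Y$ makes the instance a no-instance, since the triangle already uses all three colors.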

► Lemma 9. 2-3-Coloring with Triangle Split Decomposition is NP-complete. 


Proof. It is easy to verify the problem is in NP. We will show that it is NP-hard by giving a
reduction from 3-NAE-SAT, which is known to be NP-complete [?]. Suppose we are given
formula $F = C_1 \wedge C_2 \wedge \ldots \wedge C_m$ over set of variables $U$. Construct graph $G$ in the following
way. For every variable $x \in U$, construct a gadget as depicted in Figure 2a. For every clause
$C_i$, construct a gadget as depicted in Figure 2b. Let $C_i = (\ell_1 \vee \ell_2 \vee \ell_3)$ for $i \in [m]$, connect
vertex $\ell_j$ for $j \in \{1,2,3\}$ to vertex $v_j$ in gadget $C_i$ in $G$.

Figure 2 The gadgets constructed for the clauses and variables of $F$. (a) Gadget for a variable.

It is easy to verify that $G$ has a triangle split decomposition. In Figure 2, triangles are
shown with white vertices and the independent set is shown in black.

Suppose $G$ is 2-3-colorable with color function $c : V(G) \to \{1,2,3\}$ and let $c(v) \in \{1,2\}$
for all $v$ in the independent set. Note that in each of the pairs $\{x, \neg x\}$, $\{b_1, b_2\}$, and $\{u_1, u_2\}$
the two vertices have distinct colors in any proper 2-3-coloring of $G$. To satisfy $F$, let
$x$ = true if and only if $c(x) = 2$. To show that this results in a satisfying assignment,
consider any clause $C_i$ for $i \in [m]$. Note that $c(x) = 2 \Leftrightarrow c(\neg x) = 1$. Since $c(b_1) \neq c(b_2)$
and $c(b_1), c(b_2) \in \{1,2\}$ we obtain $c(r_0) = 3$. Therefore, $v_1$ and $u_0$ are colored using colors 1
and 2.

Suppose $c(v_1) = 1$. Then $c(\ell_1) = 2$, implying the first literal of $C_i$ is set to true. By
$c(u_0) = 2$, we know $c(u_1) = 1$ and $c(u_2) = 2$. Thereby, $c(r_1) \neq 2$, so either $c(v_2) = 2$ or
$c(v_3) = 2$. If $c(v_2) = 2$, then $c(\ell_2) = 1$ which implies that literal $\ell_2$ is false in $C_i$. Similarly,
if $c(v_3) = 2$, then $c(\ell_3) = 1$ which implies that literal $\ell_3$ is false in $C_i$. In both cases it
follows that clause $C_i$ is NAE-satisfied.

When $c(v_1) = 2$, we can use the same argument with the colors 1 and 2 swapped, to show
that $\ell_1$ is false in $C_i$ and $\ell_2$ or $\ell_3$ is true, which implies that $C_i$ is NAE-satisfied.

Suppose $F$ is a yes-instance, with satisfying truth assignment $S$. Define color function
$c : V(G) \to \{1,2,3\}$ as $c(x) := 1$ and $c(\neg x) := 2$ if $x$ is set to false in $S$; define $c(x) := 2$
and $c(\neg x) := 1$ otherwise. Color the remainder of the variable gadgets consistently. We now
need to show how to color the clause gadgets. Consider any clause $C_i = (\ell_1 \vee \ell_2 \vee \ell_3)$. At
least one of the literals is true and one is set to false, so by symmetry we only consider four
cases. The corresponding colorings are depicted in Figure 3, where red corresponds to 1,
green corresponds to 2 and blue corresponds to color 3. It is easy to verify that this leads to
a proper 3-coloring that only uses colors 1 and 2 on vertices in the independent set. ◄


► Theorem 10. 4-Coloring parameterized by the number of vertices $n$ does not have a
generalized kernel of size $O(n^{2-\varepsilon})$ for any $\varepsilon > 0$, unless NP $\subseteq$ coNP/poly.

Proof. By Theorem 3 and Lemma 9 it suffices to give a degree-2 cross-composition from
the 2-3-coloring problem defined above into 4-Coloring parameterized by the number of
vertices. For ease of presentation, we will actually give a cross-composition into the 4-List
Coloring problem, whose input consists of a graph $G$ and a list function that assigns every
vertex $v \in V(G)$ a list $L(v) \subseteq [4]$ of allowed colors. The question is whether there is a proper
coloring of the graph in which every vertex is assigned a color from its list. The 4-List
Coloring problem reduces to ordinary 4-Coloring by a simple transformation that adds a
4-clique to enforce the color lists, which will prove the theorem. For now, we focus on giving
a cross-composition into 4-List Coloring.

Figure 3 Valid colorings of a clause gadget, depending on the coloring of the literals $\ell_1, \ldots, \ell_3$.
Note that if the roles of $\ell_2$ and $\ell_3$ are exactly reversed, you can just exchange colors between their
parents to get a proper coloring for that situation.

We start by defining a polynomial equivalence relation on inputs of 2-3-Coloring with
Triangle Split Decomposition. Let two instances of 2-3-Coloring with Triangle
Split Decomposition be equivalent under equivalence relation $\mathcal{R}$ when they have the same
number of triangles and the independent sets also have the same size. It is easy to see that
$\mathcal{R}$ is a polynomial equivalence relation. By duplicating one of the inputs, we can ensure
that the number of inputs to the cross-composition is an even power of two; this does not
change the value of OR, and increases the total input size by at most a factor four. We will
therefore assume that the input consists of $t$ instances of 2-3-Coloring with Triangle
Split Decomposition such that $t = 2^{2i}$ for some integer $i$, implying that $\sqrt{t}$ and $\log \sqrt{t}$
are integers. Let $t' := \sqrt{t}$. Enumerate the instances as $X_{i,j}$ for $1 \leq i, j \leq t'$. Each input $X_{i,j}$
consists of a graph $G_{i,j}$ and a partition of its vertex set into sets $U$ and $V$, such that $U$ is
an independent set of size $m$ and $G_{i,j}[V]$ consists of $n$ vertex-disjoint triangles. Enumerate
the vertices in $U$ and $V$ as $u_1, \ldots, u_m$ and $v_1, \ldots, v_{3n}$, such that vertices $v_{3\ell-2}$, $v_{3\ell-1}$ and $v_{3\ell}$
form a triangle, for $\ell \in [n]$. We will create an instance of the 4-List-Coloring problem,
which consists of a graph $G'$ and a list function $L$ that assigns each vertex a subset of the
color palette $\{x, y, z, a\}$. Refer to Figure 4 for a sketch of $G'$.
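The padding step and the enumeration of the inputs as a $t' \times t'$ table can be sketched as follows. The helper names are hypothetical, instances are treated as opaque objects, the input list is assumed to be non-empty, and indices are 0-based here whereas the text uses 1-based indices.

```python
import math

def pad_to_even_power_of_two(instances):
    """Duplicate the first instance until the count is 4^i = 2^(2i).
    The padded count is less than four times the original count."""
    t = len(instances)
    padded = 1
    while padded < t:
        padded *= 4
    return instances + [instances[0]] * (padded - t)

def as_table(instances):
    """Arrange t = t' * t' instances so that table[i][j] plays the role of X_{i,j}."""
    t = len(instances)
    side = math.isqrt(t)
    assert side * side == t, "number of instances must be a perfect square"
    return [[instances[i * side + j] for j in range(side)] for i in range(side)]
```

Duplicating a yes-instance or a no-instance does not change the OR of the inputs, which is why padding is harmless.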

1. Initialize $G'$ as the graph containing $t'$ sets of $m$ vertices each, called $S_i$ for $i \in [t']$. Label
the vertices in each of these sets as $s^i_\ell$ for $i \in [t']$, $\ell \in [m]$ and let $L(s^i_\ell) := \{x, y, a\}$.

2. Add $t'$ sets of $n$ triangular gadgets each, labeled $T_j$ for $j \in [t']$. Label the corner vertices
in $T_j$ as $t^j_\ell$ for $\ell \in [3n]$, such that vertices $t^j_{3\ell-2}$, $t^j_{3\ell-1}$ and $t^j_{3\ell}$ are the corner vertices of one
of the gadgets for $\ell \in [n]$. Let $L(t^j_\ell) := \{x, y, z\}$ and for any inner vertex $v$ of a triangular
gadget, let $L(v) := \{x, y, z, a\}$.

3. Connect vertex $s^i_k$ to vertex $t^j_\ell$ if in graph $G_{i,j}$ vertex $u_k$ is connected to $v_\ell$, for $k \in [m]$
and $\ell \in [3n]$. By this construction, the subgraph of $G'$ induced by $S_i \cup T_j$ is isomorphic
to the graph obtained from $G_{i,j}$ by replacing each triangle with a triangular gadget.

4. Add a treegadget $G_S$ with $t'$ leaves to $G'$ and enumerate these leaves as $1, \ldots, t'$; recall
that $t'$ is a power of two. Connect the $i$-th leaf of $G_S$ to every vertex in $S_i$. Let the root of
$G_S$ be $r_S$ and define $L(r_S) := \{x, y\}$. For every other vertex $v$ in $G_S$ let $L(v) := \{x, y, a\}$.

5. Add a treegadget $G_T$ with $2t'$ leaves to $G'$ and enumerate these leaves as $1, \ldots, 2t'$. For
$j \in [t']$, connect every inner vertex of a triangular gadget in group $T_j$ to leaf number
$2j - 1$ of $G_T$. For every leaf $v$ with an even index let $L(v) := \{y, z\}$ and let the root $r_T$
have list $L(r_T) := \{y, z\}$. For every other vertex $v$ of gadget $G_T$ let $L(v) := \{y, z, a\}$.

Figure 4 The graph $G'$ for $t' = 4$, $m = 3$ and $n = 2$. Edges between vertices in $S$ and $T$ are left
out for simplicity.

► Claim 11. The graph $G'$ is 4-list-colorable $\Leftrightarrow$ some input instance $X_{i^*,j^*}$ is 2-3-colorable.

Proof. ($\Rightarrow$) Suppose we are given a 4-list coloring $c$ for $G'$. By definition, $c(r_S) \neq a$. From
Lemma 5 it follows that there is a leaf $v$ of $G_S$ such that $c(v) = a$. This leaf is connected to
all vertices in some $S_{i^*}$, which implies that none of the vertices in $S_{i^*}$ are colored using $a$.
Therefore all vertices in $S_{i^*}$ are colored using $x$ and $y$. Similarly the gadget $G_T$ has at least
one leaf $v$ such that $c(v) = a$; note that this must be a leaf with an odd index. Therefore
there exists $T_{j^*}$ where all vertices are colored using $x$, $y$ or $z$. Thereby in $S_{i^*} \cup T_{j^*}$ only three
colors are used, such that $S_{i^*}$ is colored using only two colors. Using Observation 8 and the
fact that $G'[S_{i^*} \cup T_{j^*}]$ is isomorphic to the graph obtained from $G_{i^*,j^*}$ by replacing triangles
by triangular gadgets, we conclude that $X_{i^*,j^*}$ has a proper 2-3-coloring.

($\Leftarrow$) Suppose $c: V(G_{i^*,j^*}) \to \{x, y, z\}$ is a proper 2-3-coloring for $X_{i^*,j^*}$. We will construct
a 4-list coloring $c': V(G') \to \{x, y, z, a\}$ for $G'$. For $u_k$, $k \in [m]$ in instance $X_{i^*,j^*}$ let
$c'(s^{i^*}_k) := c(u_k)$ and for $v_\ell$ for $\ell \in [3n]$ let $c'(t^{j^*}_\ell) := c(v_\ell)$. Let $c'(s^i_\ell) := a$ for $i \neq i^*$ and
$\ell \in [m]$; furthermore let $c'(t^j_\ell) := z$ for $j \neq j^*$ and $\ell \in [3n]$. For triangular gadgets in $T_{j^*}$ the
coloring $c'$ defines all corners to have distinct colors; by Observation 8 we can color the inner
vertices consistently using $\{x, y, z\}$. For $T_j$ with $j \in [t']$ and $j \neq j^*$, the corners of triangular
gadgets have color $z$ and we can now consistently color the inner vertices using $\{x, y, a\}$.

The leaf of gadget $G_S$ that is connected to $S_{i^*}$ can be colored using $a$. Every other leaf
can use both $x$ and $y$, so we can properly 3-color the leaves such that one leaf has color $a$.
From Lemma 6 it follows that we can consistently 3-color $G_S$ such that the root $r_S$ does not
receive color $a$, as required by $L(r_S)$. Similarly, in triangular gadgets in $T_{j^*}$ the inner vertices
do not have color $a$. As such, leaf $2j^* - 1$ of $G_T$ can be colored using $a$ and we color leaf $2j^*$
with $y$. For $j \in [t']$ with $j \neq j^*$ color leaf $2j - 1$ with $z$ and leaf $2j$ using $y$. Now the leaves
of $G_T$ are properly 3-colored and one is colored $a$. It follows from Lemma 6 that we can
color $G_T$ such that the root is not colored $a$. This completes the 4-list coloring of $G'$. ◄


The claim shows that the construction serves as a cross-composition into 4-List Coloring.
To prove the theorem, we add four new vertices to simulate the list function. Add a clique
on 4 vertices $\{x, y, z, a\}$. If for any vertex $v$ in $G'$, some color is not contained in $L(v)$,
connect $v$ to the vertex corresponding to this color. As proper colorings of the resulting graph
correspond to proper list colorings of $G'$, the resulting graph is 4-colorable if and only if
there is a yes-instance among the inputs. It remains to bound the parameter of the problem,
i.e., the number of vertices. Observe that a treegadget has at least as many leaves as its
corresponding binary tree, therefore the resulting graph has at most $mt' + 12nt' + 12t' + 4 =
O(t' \cdot (m + n)) = O(\sqrt{t} \cdot \max |X_{i,j}|)$ vertices. Theorem 10 now follows from Theorem 3 and
Lemma 9. ◄
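The final step of the proof, simulating color lists with a 4-clique, can be illustrated on tiny instances. In the sketch below the palette vertices are represented as tuples `("palette", color)`; this representation, the function names and the brute-force checker are assumptions of the example.

```python
from itertools import product

PALETTE = ("x", "y", "z", "a")

def lists_to_plain(vertices, edges, lists):
    """Add a 4-clique of palette vertices and join each vertex v to the
    palette vertex of every color that is missing from its list L(v)."""
    palette_vertices = [("palette", color) for color in PALETTE]
    new_edges = list(edges)
    for i, p in enumerate(palette_vertices):
        for q in palette_vertices[i + 1:]:
            new_edges.append((p, q))
    for v in vertices:
        for color in PALETTE:
            if color not in lists[v]:
                new_edges.append((v, ("palette", color)))
    return list(vertices) + palette_vertices, new_edges

def colorable(vertices, edges, allowed):
    """Brute force: is there a proper coloring with each vertex colored from allowed[v]?"""
    for choice in product(*(allowed[v] for v in vertices)):
        coloring = dict(zip(vertices, choice))
        if all(coloring[u] != coloring[v] for u, v in edges):
            return True
    return False
```

In any proper 4-coloring of the new graph the clique uses all four colors, so a vertex adjacent to the palette vertices outside its list is forced onto a color that matches one of the palette vertices inside its list.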


4 Hamiltonian cycle


In this section we prove a sparsification lower bound for Hamiltonian Cycle and its directed
variant by giving a degree-2 cross-composition. The starting problem is Hamiltonian s-t
Path on Bipartite Graphs.

Hamiltonian s-t Path on Bipartite Graphs

Input: An undirected bipartite graph $G$ with partite sets $A$ and $B$ such that $|B| = n =
|A| + 1$, together with two distinguished vertices $b_1$ and $b_n$ that have degree 1.

Question: Does $G$ have a Hamiltonian path from $b_1$ to $b_n$?

It is known that Hamiltonian Path is NP-complete on bipartite graphs [?] and it is easy
to see that it remains NP-complete when fixing a degree-1 start and endpoint.
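For concreteness, the source problem can be restated as a brute-force check over all orderings of the inner vertices. This sketch takes factorial time, does not verify bipartiteness or the degree condition, and its function name and input format are assumptions of the example.

```python
from itertools import permutations

def has_hamiltonian_path(vertices, edges, s, t):
    """Return True if the undirected graph has a path from s to t
    that visits every vertex exactly once."""
    edge_set = {frozenset(e) for e in edges}
    inner = [v for v in vertices if v not in (s, t)]
    for order in permutations(inner):
        path = (s, *order, t)
        if all(frozenset((u, v)) in edge_set for u, v in zip(path, path[1:])):
            return True
    return False
```

With $|B| = |A| + 1$ and both endpoints in $B$, any such path necessarily alternates between the two partite sets.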

► Theorem 12. (Directed) Hamiltonian Cycle parameterized by the number of vertices
$n$ does not have a generalized kernel of size $O(n^{2-\varepsilon})$ for any $\varepsilon > 0$, unless NP $\subseteq$ coNP/poly.


Proof. By a suitable choice of polynomial equivalence relation, and by padding the number
of inputs, it suffices to give a cross-composition from the s-t path problem on bipartite graphs
when the input consists of $t$ instances $X_{i,j}$ for $i, j \in [\sqrt{t}]$ (i.e., $\sqrt{t}$ is an integer), such that each
instance $X_{i,j}$ encodes a bipartite graph $G_{i,j}$ with partite sets $A^*_{i,j}$ and $B^*_{i,j}$ with $|A^*_{i,j}| = m$
and $|B^*_{i,j}| = n = m + 1$, for some $m \in \mathbb{N}$. For each instance, label all elements in $A^*_{i,j}$ as
$a^*_1, \ldots, a^*_m$ and all elements in $B^*_{i,j}$ as $b^*_1, \ldots, b^*_n$ such that $b^*_1$ and $b^*_n$ have degree 1.

The construction makes extensive use of the path gadget depicted in Figure 5a. Observe
that if $G'$ contains a path gadget as an induced subgraph, while the remainder of the graph
only connects to its terminals IN$^0$ and IN$^1$, then any Hamiltonian cycle in $G'$ traverses the
path gadget in one of the two ways depicted in Figure 5a. We create an instance $G'$ of
Directed Hamiltonian Cycle that acts as the logical OR of the inputs.


1. First of all construct $\sqrt{t}$ groups of $m$ path gadgets each. Refer to these groups as $A_i$, for
$i \in [\sqrt{t}]$, and label the gadgets within group $A_i$ as $a^i_1, \ldots, a^i_m$. Let the union of all created
sets $A_i$ be named $A$. Similarly, construct $\sqrt{t}$ groups of $n$ path gadgets each. Refer to
these groups as $B_j$, for $j \in [\sqrt{t}]$, and label the gadgets within group $B_j$ as $b^j_1, \ldots, b^j_n$. Let
$B$ be the union of all $B_j$ for $j \in [\sqrt{t}]$.

2. For every input instance $X_{i,j}$, for each edge $\{a^*_k, b^*_\ell\}$ in $X_{i,j}$ with $k \in [m]$, $\ell \in [n]$, add an
arc from IN$^0$ of $a^i_k$ to IN$^1$ of $b^j_\ell$ and an arc from IN$^0$ of $b^j_\ell$ to IN$^1$ of $a^i_k$.

If some $X_{i,j}$ has a Hamiltonian s-t path, it can be mimicked by the combination of $A_i$ and
$B_j$, where for each vertex in $X_{i,j}$ we traverse its path gadget in $G'$, following Path 1. The
following construction steps are needed to extend such a path to a Hamiltonian cycle in $G'$.

3. Add an arc from the IN$^1$ terminal of $a^i_\ell$ to the IN$^0$ terminal of $a^i_{\ell+1}$ for all $\ell \in [m - 1]$ and
all $i \in [\sqrt{t}]$. Similarly add an arc from the IN$^1$ terminal of $b^j_\ell$ to the IN$^0$ terminal of $b^j_{\ell+1}$ for all
$\ell \in [n - 1]$ and all $j \in [\sqrt{t}]$.

4. Add a vertex start and a vertex end and the arc (end, start).

5. Let $r := \sqrt{t} - 1$, add $2r$ tuples of vertices, $x_i, y_i$ for $i \in [2r]$ and connect start to $x_1$.
Furthermore, add the arcs $(y_i, x_{i+1})$ for $i \in [2r - 1]$.

6. For $i \leq r$ we add arcs from $x_i$ to the IN$^0$ terminal of the gadgets $a^j_1$, $j \in [\sqrt{t}]$. Furthermore
we add an arc from IN$^1$ of $a^j_m$ to $y_i$ for all $j \in [\sqrt{t}]$ and $i \in [r]$. When $i > r$ add arcs from
$x_i$ to the IN$^0$ terminal of $b^j_1$ for $j \in [\sqrt{t}]$ and connect IN$^1$ of $b^j_n$ to $y_i$.

7. Add a vertex next and the arc $(y_{2r}, \text{next})$ and an arc from next to the IN$^1$ terminal of
all gadgets $b^j_1$ for $j \in [\sqrt{t}]$.

8. Furthermore, add arcs from IN$^0$ of all gadgets $b^j_n$ to end for $j \in [\sqrt{t}]$. So for each $B_j$,
exactly one vertex has an outgoing arc to end and one has an incoming arc from next.

Figure 5 Illustrations for the lower bound for Hamiltonian Cycle. (a) A path gadget. (b) The
general structure of the created graph, when given 4 inputs with $n = 3$ and $m = 2$.


This completes the construction of $G'$. A sketch of $G'$ is shown in Figure 5b. In order to
prove that the created graph $G'$ acts as a logical OR of the given input instances, we first
establish a number of auxiliary lemmas.


► Lemma 13. Any Hamiltonian cycle in $G'$ traverses any path gadget in $G'$ via directed
Path 0 or Path 1, as shown in Figure 5a.


Proof. Any Hamiltonian cycle in $G'$ should visit the center vertex of the path gadget. Since
IN$^0$ and IN$^1$ are its only two neighbors in $G'$, the only option is to visit them consecutively;
Path 0 and Path 1 are the only two options to do this. ◄
► Lemma 14. When any Hamiltonian cycle in $G'$ enters path gadget $a^i_1$ at IN$^0$ for some
$i \in [\sqrt{t}]$, the cycle then visits the gadgets $a^i_1, a^i_2, \ldots, a^i_m$ in order without visiting other vertices
in between. Similarly, if any Hamiltonian cycle in $G'$ enters path gadget $b^j_1$ at IN$^0$, the cycle
then visits the gadgets $b^j_1, b^j_2, \ldots, b^j_n$ in order without visiting other vertices in between.




Proof. Consider a Hamiltonian cycle in $G'$ that enters path gadget $a^i_1$ at IN$^0$. By Lemma 13
the cycle follows Path 0 and continues to the IN$^1$ terminal of the path gadget. Since that
terminal has only one out-neighbor outside the gadget, which leads to the IN$^0$ terminal of $a^i_2$,
it follows that the cycle continues to that path gadget. As the adjacency structure around
the other path gadgets is similar, the lemma follows by repeating this argument. The proof
when entering group $B_j$ at the vertex IN$^0$ of $b^j_1$ is equivalent. ◄


► Lemma 15. Let $C$ be a directed Hamiltonian cycle in $G'$, such that its first arc is
(start, $x_1$). There are indices $i^*, j^* \in [\sqrt{t}]$ such that the subpath of the cycle between $x_1$
and $y_{2r}$ contains exactly the vertices

$\overline{A}_{i^*} \cup \overline{B}_{j^*} \cup \{x_i, y_i \mid i \in [2r]\}$

where $\overline{A}_{i^*}$ contains all vertices of all gadgets in $A_i$ for $i \neq i^*$, and similarly $\overline{B}_{j^*}$ contains all
vertices of all gadgets in $B_j$ for $j \neq j^*$.


Proof. We will first show that when the cycle reaches any $x_i$ for $i \in [r]$, it traverses exactly
one group $A_\ell$ with $\ell \in [r+1]$ and continues to $y_j$ and $x_{j+1}$ for some $j \in [r]$, without visiting
other vertices in between. Similarly, when the cycle reaches any $x_i$ for $r < i \le 2r$, it traverses
exactly one group $B_\ell$ with $\ell \in [r+1]$ and continues to $y_j$ for some $r < j \le 2r$. For $j < 2r$,
the cycle then continues to $x_{j+1}$; for $j = 2r$ the cycle has reached $y_{2r}$, which is the last vertex of
this subpath.

By the construction, all outgoing arcs of any $x_i$ for $i \in [r]$ lead to gadgets $a^\ell_1$
for some $\ell \in [\sqrt{t}]$. So for any $x_i$ in the cycle there must be a unique $\ell \in [\sqrt{t}]$ such that the
arc from $x_i$ to the $\text{IN}^0$ terminal of $a^\ell_1$ is in C. By Lemma 14 the cycle visits all vertices in $A_\ell$,
and no other vertices, before reaching gadget $a^\ell_m$, which is traversed by Path 0 to get to $\text{IN}^1$
of this gadget. The only neighbors of $\text{IN}^1$ of gadget $a^\ell_m$ lying outside this gadget are of type
$y_j$ for $j \in [r]$. As such, the cycle must visit some $y_j$ next, and its only outgoing arc goes to
$x_{j+1}$.

The proof for $i > r$ is similar. As such, visiting $x_i$ for $i \in [r]$ results in visiting all vertices
of exactly one group in A before continuing via $y_j$ to some $x_{j+1}$ without visiting any vertices
in between. Visiting $x_i$ for $r < i \le 2r$ results in visiting all vertices of exactly one group in
B and continuing via $y_j$ to either the end of the subpath ($j = 2r$) or some $x_{j+1}$.

Every vertex $x_i$ for $i \in [2r]$ must be visited by C; it remains to show that it is visited
in the subpath $C_{x_1,y_{2r}}$. Suppose there exists an $x_i$ for $i \in [2r]$ such that $x_i$ is not visited in the
subpath from $x_1$ to $y_{2r}$. As we have seen above, visiting $x_i$ results in visiting all vertices
in some group in A or B, followed by visiting some $y_j$ for $j \in [2r]$. Note that no other
vertices are visited in between. Hence $y_j$ is not in the subpath $C_{x_1,y_{2r}}$. This implies $j \neq 2r$
and thus the next vertex in the cycle is $x_{j+1}$. So, for $x_i$ not in the subpath $C_{x_1,y_{2r}}$, we can find
a new vertex $x_{j+1}$ (where $j + 1 \neq i$) such that $x_{j+1}$ is also not in the subpath $C_{x_1,y_{2r}}$. Note
that we cannot create a loop by visiting a vertex $x_i$ seen earlier, as this would not yield
a Hamiltonian cycle in G'; for example, the vertex start would never be visited. This is
a contradiction, since there are only finitely many vertices $x_i$.

Thus in the subpath $C_{x_1,y_{2r}}$ exactly r groups of A are visited and exactly r groups of B
are visited, and no other vertices than specified. This leaves exactly one group $A_{i^*}$ and one
group $B_{j^*}$ unvisited in $C_{x_1,y_{2r}}$. ◄
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In Step 1^ we create a selection mechanism that leaves one group in A and one in B 
unvisited. The following lemma formalizes this idea. 

► Lemma 16. Let C be a Hamiltonian cycle in G', such that its first arc is (start, $x_1$). Let
$i^*$ and $j^*$ satisfy the conditions of Lemma 15. Then cycle C visits $b^{j^*}_1$ before $b^{j^*}_n$. Moreover,
the subpath $C_{i^*,j^*}$ of the cycle between terminal $\text{IN}^1$ of $b^{j^*}_1$ and $\text{IN}^0$ of $b^{j^*}_n$ (inclusive) contains
all vertices of the gadgets in $A_{i^*}$ and $B_{j^*}$ and no others.

Proof. Vertex next is visited directly after $y_{2r}$, since it is the only out-neighbor of $y_{2r}$.
Furthermore, the arc from next to gadget $b^\ell_1$ must be in the cycle for some $\ell \in [\sqrt{t}]$, since
next only has outgoing arcs of this type. By Lemma 15, all gadgets in all $B_j$ for $j \neq j^*$
are visited in the path from $x_1$ to $y_{2r}$, and thus should not be visited after vertex next.
Therefore, the arc from next to gadget $b^{j^*}_1$ is in the cycle, which also implies that $b^{j^*}_1$ is
visited before $b^{j^*}_n$.

It is easy to see that (end, start) is the last arc in C. By considering the incoming arcs
of end it follows that some arc from terminal $\text{IN}^0$ of $b^\ell_n$ to end for $\ell \in [\sqrt{t}]$ is in the cycle.
Since the vertices in gadgets $b^\ell_n$ for $\ell \neq j^*$ are already visited in $C_{x_1,y_{2r}}$ by Lemma 15, it
follows that $(b^{j^*}_n, \text{end})$ is in C.

By Lemma 15, none of the terminals of gadgets in $A_{i^*}$ and $B_{j^*}$ are visited in the subpath
$C_{x_1,y_{2r}}$, or equivalently in the subpath $C_{\text{start},\text{next}}$. Since C is a Hamiltonian cycle these
vertices must therefore be visited in $C_{\text{next},\text{start}}$, which is equivalent to saying that $C_{i^*,j^*}$
must contain all vertices in $A_{i^*} \cup B_{j^*}$. It is easy to see that this subpath cannot contain any
other vertices, as all other vertices are present in $C_{\text{start},\text{next}}$ or $C_{\text{end},\text{start}}$. ◄

Using the lemmas above, we can now prove that G' has a Hamiltonian cycle if and only
if one of the instances has a Hamiltonian path.

► Lemma 17. Graph G' has a directed Hamiltonian cycle if and only if at least one of the
instances $X_{i,j}$ has a Hamiltonian s-t path.


Proof. ($\Rightarrow$) Suppose G' has a Hamiltonian cycle C. By Lemma 16 there exist $i^*, j^* \in [\sqrt{t}]$
such that the subpath of C from gadget $b^{j^*}_1$ to $b^{j^*}_n$ visits exactly the gadgets in $A_{i^*} \cup B_{j^*}$. Since
gadget $b^{j^*}_1$ is entered at terminal $\text{IN}^1$, it is easy to see that all gadgets are traversed using Path
1. We now construct a Hamiltonian path P for instance $X_{i^*,j^*}$. Let $\{a^*_k(i^*,j^*), b^*_\ell(i^*,j^*)\} \in P$
if the arc from $\text{IN}^0$ of $a^{i^*}_k$ to $\text{IN}^1$ of $b^{j^*}_\ell$ is in C. Similarly let $\{b^*_\ell(i^*,j^*), a^*_k(i^*,j^*)\} \in P$ if the
arc from $\text{IN}^0$ of $b^{j^*}_\ell$ to $\text{IN}^1$ of $a^{i^*}_k$ is in C, where $k \in [m]$ and $\ell \in [n]$. Using that every gadget
is visited exactly once via Path 1 in C, we see that P is a Hamiltonian path.

($\Leftarrow$) Suppose $X_{i^*,j^*}$ has a Hamiltonian s-t path P. Then we create a Hamiltonian cycle
C as follows. For each vertex $a^*_k$ from instance $X_{i^*,j^*}$ in P we add Path 1 in path gadget $a^{i^*}_k$ to C, and
for each vertex $b^*_\ell$ we add Path 1 in path gadget $b^{j^*}_\ell$ to C. Let P be ordered such that $b^*_1$ is
its first vertex. Now if $a^*_k$ is followed by $b^*_\ell$ in P, the arc from terminal $\text{IN}^0$ of $a^{i^*}_k$ to $\text{IN}^1$ of
$b^{j^*}_\ell$ is added to C. Similarly, if a vertex $b^*_\ell$ is followed by $a^*_k$ in P, the arc from terminal $\text{IN}^0$
of $b^{j^*}_\ell$ to $\text{IN}^1$ of $a^{i^*}_k$ is added to C. Now the subpath $C_{i^*,j^*}$ contains all terminals in
all gadgets in $A_{i^*} \cup B_{j^*}$.

From $b^{j^*}_n$ the cycle goes to end, then to start and to $x_1$. To visit all groups $A_i$ for $i \neq i^*$
and $B_j$ for $j \neq j^*$, do the following.

■ From $x_i$ where $i < i^*$, the cycle continues to gadget $a^i_1$, then to $a^i_2, a^i_3, \ldots, a^i_m$ following
Path 0, and continues to $y_i, x_{i+1}$.

■ From $x_i$ where $i^* \le i \le r$ it goes to $a^{i+1}_1, \ldots, a^{i+1}_m$ and continues with $y_i, x_{i+1}$.

■ Similarly, from $x_i$ where $r < i < r + j^*$, go through gadgets $b^{i-r}_1, \ldots, b^{i-r}_n$ and continue to
$y_i, x_{i+1}$.

■ From $x_i$ where $r + j^* \le i \le 2r$, go to gadgets $b^{i-r+1}_1, \ldots, b^{i-r+1}_n$ and continue to $y_i$; for $i \neq 2r$
then add the arc $(y_i, x_{i+1})$.

From $y_{2r}$, continue to next, after which the arc from next to $b^{j^*}_1$ closes the cycle. By definition,
no vertex is visited twice, so it remains to check that every vertex of G' is in the cycle. For
vertices start, next, end and all vertices $x_i, y_i, z_i$ this is obvious. All vertices in $A_i$ and $B_j$
where $i \neq i^*$ and $j \neq j^*$ are in the cycle between some $x_i$ and $y_i$. All vertices in $A_{i^*}$ and
$B_{j^*}$ are visited since P was a Hamiltonian path on these vertices. ◄


The number of vertices of G' is $3(m + n)\sqrt{t} + 3 \cdot 2(\sqrt{t} - 1) + 3 = O(\sqrt{t} \cdot (m + n)) =
O(\sqrt{t} \cdot \max|X_{i,j}|)$. Combined with Lemma 17, the construction is a degree-2 cross-composition
from Hamiltonian s-t path in bipartite graphs to Directed Hamiltonian cycle
parameterized by the number of vertices, proving the generalized kernel lower bound for the 
directed problem. Karp m gave a polynomial-time reduction that, given an n-vertex directed 
graph G, produces an undirected graph G' with 3n vertices such that G has a directed 
Hamiltonian cycle if and only if G' has a Hamiltonian cycle. This is a linear parameter 
transformation from Directed Hamiltonian cycle to Hamiltonian cycle. Since 
linear-parameter transformations transfer lower bounds 12111], we conclude that (Directed) 
Hamiltonian cycle does not have a generalized kernel of size $O(n^{2-\varepsilon})$ for any $\varepsilon > 0$. ◄
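
The text above only states that a reduction with 3n vertices exists; it does not spell it out. The following is a minimal sketch of the standard textbook vertex-splitting construction, which is assumed here to be the one meant. All function names are hypothetical, and the two brute-force checkers are only intended for sanity checks on tiny graphs without self-loops.

```python
from itertools import permutations


def directed_to_undirected(n, arcs):
    """Split each vertex v into v_in - v_mid - v_out; arc (u, v) becomes {u_out, v_in}.

    Assumption: this is the usual textbook splitting, not taken from the paper's text.
    """
    def v_in(v):
        return 3 * v

    def v_mid(v):
        return 3 * v + 1

    def v_out(v):
        return 3 * v + 2

    edges = set()
    for v in range(n):
        edges.add(frozenset((v_in(v), v_mid(v))))
        edges.add(frozenset((v_mid(v), v_out(v))))
    for u, v in arcs:
        edges.add(frozenset((v_out(u), v_in(v))))
    return 3 * n, edges


def has_directed_ham_cycle(n, arcs):
    """Brute force over vertex orders that start at vertex 0."""
    arc_set = set(arcs)
    for perm in permutations(range(1, n)):
        order = (0,) + perm
        if all((order[i], order[(i + 1) % n]) in arc_set for i in range(n)):
            return True
    return False


def has_undirected_ham_cycle(n, edges):
    """Brute force over vertex orders that start at vertex 0."""
    for perm in permutations(range(1, n)):
        order = (0,) + perm
        if all(frozenset((order[i], order[(i + 1) % n])) in edges for i in range(n)):
            return True
    return False
```

The middle vertex of each triple has degree 2, so any undirected Hamiltonian cycle must pass through in, mid, out consecutively, which is what lets it be read back as a directed cycle. The vertex count grows by a factor of exactly 3, so the parameter changes only linearly.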


5 Dominating set

In this section we discuss the Dominating Set problem and its variants. Dom et al. [S] 
proved several kernelization lower bounds for the variant Red-Blue Dominating Set, 
which is the variant on bipartite (red/blue colored) graphs in which the goal is to dominate 
all the blue vertices by selecting a small subset of red vertices. Using ideas from their kernel 
lower bounds for the parameterization by either the number of red or the number of blue 
vertices, we prove sparsification lower bounds for (Connected) Dominating Set. Since we 
parameterize by the number of vertices, the same lower bounds apply to the dual problems 
Nonblocker and Max Leaf Spanning Tree. 

We will prove these sparsification lower bounds using a degree-2 cross-composition, starting 
from a variation of the Colored Red-Blue Dominating Set problem (Col-RBDS) as 
described by Dom et al. in [n|. 

Equal-Sized Colored Red/Blue Dominating Set (Eq-Col-RBDS) 

Input: A bipartite graph $G = (R \cup B, E)$, where R is partitioned into k subsets $R_1, \ldots, R_k$
such that $|R_1| = |R_2| = \ldots = |R_k|$.

Question: Is there a set $S \subseteq R$ such that for each $i \in [k]$ the set S contains exactly one
vertex of $R_i$ and every vertex in B is adjacent to at least one vertex from S?

We will think of the vertices in set $R_i$ as having color i. Hence the question is whether
there is a set $S \subseteq R$ containing exactly one vertex of each color, such that every vertex in B
is adjacent to at least one vertex in S.

► Lemma 18. Eq-Col-RBDS is NP-complete.

Proof. Dom et al. [5] proved the NP-completeness of Colored RBDS without the constraint 
that all color sets have equal size. The NP-completeness for the equal-sized version follows 
from the fact that we may repeatedly add isolated vertices to classes Ri that are too small, 
without changing the answer. ◄ 
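
The padding step in this proof can be sketched as follows. This is only an illustration under assumed conventions: color classes are given as lists of vertex names, and the fresh vertices are tagged tuples (a hypothetical naming scheme). Since the new vertices are isolated, they never help to dominate a blue vertex.

```python
def pad_color_classes(classes):
    """Pad every color class with fresh isolated vertices up to the largest class size.

    classes: list of lists of red vertex names, one list per color.
    The ("pad", color, index) names are hypothetical placeholders.
    """
    target = max(len(members) for members in classes)
    padded = []
    for color, members in enumerate(classes):
        fresh = [("pad", color, i) for i in range(target - len(members))]
        padded.append(list(members) + fresh)
    return padded
```

The edge set is left unchanged, so a set containing one vertex per color that dominates B in the original instance still does so after padding, and any solution that uses a padding vertex can swap it for an arbitrary original vertex of the same color.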






Using this result, we can now give a degree-2 cross-composition and prove the following. 

► Theorem 19. (Connected) Dominating Set, Nonblocker, and Max Leaf Spanning
Tree parameterized by the number of vertices n do not have a generalized kernel of
size $O(n^{2-\varepsilon})$ for any $\varepsilon > 0$, unless NP ⊆ coNP/poly.

Proof. A graph has a nonblocker of size k if and only if it has a dominating set of size 
n — k. Furthermore, the Maximum Leaf Spanning Tree problem is strongly related to 
Connected Dominating Set. The internal vertices of any spanning tree form a connected 
dominating set. Conversely, any connected dominating set contains a subtree spanning the 
dominating set, which - by the domination property - can be greedily extended to a spanning 
tree for the entire graph in which the remaining vertices are leaves. Hence a graph has a 
connected dominating set of size at most k if and only if it has a spanning tree with at least 
n — k leaves. Therefore we will show this result for (Connected) Dominating Set only. 
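
The greedy extension used in this argument can be sketched as follows, assuming the graph is given as a dictionary of adjacency sets and that the given set is indeed a connected dominating set. The function names are hypothetical.

```python
from collections import deque


def spanning_tree_from_cds(adj, cds):
    """Build a spanning tree whose non-leaf vertices all lie in the given set.

    First run a BFS inside the connected dominating set, then attach every
    remaining vertex as a leaf to one of its neighbors in the set.
    """
    cds = set(cds)
    root = next(iter(cds))
    seen = {root}
    tree = []
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in cds and w not in seen:
                seen.add(w)
                tree.append((u, w))
                queue.append(w)
    if seen != cds:
        raise ValueError("the given set is not connected")
    for v in adj:
        if v not in cds:
            dominator = next((w for w in adj[v] if w in cds), None)
            if dominator is None:
                raise ValueError("the given set is not dominating")
            tree.append((dominator, v))
    return tree


def leaves(tree_edges):
    """Vertices of degree 1 in the tree given by its edge list."""
    degree = {}
    for u, v in tree_edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return {v for v, d in degree.items() if d == 1}
```

Every vertex outside the set receives exactly one tree edge, so a connected dominating set of size k yields a spanning tree with at least n - k leaves.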

Define a polynomial equivalence relation $\mathcal{R}$ on instances of Eq-Col-RBDS by first of all
letting all instances where there is a vertex in B of degree 0 be in the same class; note that
these are always no-instances. Let two instances $(G = (R \cup B), k)$ and $(G' = (R' \cup B'), k')$ of
Eq-Col-RBDS be equivalent if $|R| = |R'|$, $|B| = |B'|$ and $k = k'$. It is easy to see that $\mathcal{R}$
indeed is a polynomial equivalence relation.

Suppose we are given t instances of Eq-Col-RBDS, such that $\sqrt{t}$ and $\log \sqrt{t} \in \mathbb{N}$ and
such that all given instances are in the same equivalence class of $\mathcal{R}$. Let $t' := \sqrt{t}$. If these
instances are from the class where B contains a vertex of degree 0, output a constant size
no-instance.

Otherwise, label the given instances as $X_{i,j}$ with $i,j \in [t']$. Let instance $X_{i,j}$ have graph
$G_{i,j}$, which is bipartite with vertex set $R^*_{i,j} \cup B^*_{i,j}$. Let $|R^*_{i,j}| = m$ and $|B^*_{i,j}| = n$ and let $R^*_{i,j}$
be partitioned into k color classes, one for each $p \in [k]$, for all $i,j \in [t']$. Label all vertices in $R^*_{i,j}$
as $r^*_{p,q}(i,j)$ with $p \in [k]$ and $q \in [m/k]$, which means that this vertex is the q'th vertex of
color p from instance $X_{i,j}$. Label vertices in $B^*_{i,j}$ as $b^*_1(i,j), \ldots, b^*_n(i,j)$ arbitrarily. We now
create an instance (G, k) for Dominating Set using the following steps. A sketch of G can
be found in Figure 6.

1. Add vertices $r^i_{p,q}$ for $p \in [k]$, $q \in [m/k]$ and $i \in [t']$. The dominating set problem does
not use colored instances, however we will remember the color of these vertices for
simplicity. Let vertex $r^i_{p,q}$ have color p, for $i \in [t']$, $q \in [m/k]$ and $p \in [k]$. Define
$R_i := \{r^i_{p,q} \mid p \in [k], q \in [m/k]\}$ and let $R := \bigcup_{i \in [t']} R_i$. Give every set $R_i$ a unique
identifier $\text{ID}(R_i)$, which is a subset of $K := 2 + k + \log t'$ numbers in the range $[2K]$.

2. Add vertices $b^j_\ell$ for $\ell \in [n]$ and $j \in [t']$. Define $B_j$ and B as $B_j := \{b^j_\ell \mid \ell \in [n]\}$ and
$B := \bigcup_{j \in [t']} B_j$.

3. Add edges between the vertices $r^i_{p,q}$ and $b^j_\ell$ for $p \in [k]$, $q \in [m/k]$ and $i,j \in [t']$ if $r^*_{p,q}(i,j)$
is connected to $b^*_\ell(i,j)$ in instance $X_{i,j}$. This ensures that the graph induced by $R_i \cup B_j$
is exactly $G_{i,j}$ and the coloring of vertices in $R_i$ matches the coloring of $R^*_{i,j}$.

4. Add vertices s' and s and edge {s', s}. Furthermore, add edges between s and all vertices 
in R. The degree-1 vertex s' ensures there is a minimum dominating set containing s, 
which covers all vertices in R “for free”. 

5. In a similar way as given by Dom et al. in [S], for every pair of colors $(c_1, c_2) \in \{1, \ldots, k\} \times
\{1, \ldots, k\}$ with $c_1 \neq c_2$ we add a vertex set $W_{(c_1,c_2)} = \{w^x_{(c_1,c_2)} \mid x \in [2K]\}$. For $i \in [t']$ and
$x \in [2K]$, connect $w^x_{(c_1,c_2)}$ to all vertices of color $c_1$ in $R_i$ if $x \in \text{ID}(R_i)$, otherwise connect
$w^x_{(c_1,c_2)}$ to all vertices of color $c_2$ in $R_i$. This construction is used to choose which $R_i$ is
part of a solvable input instance $X_{i,j}$ for some $j \in [t']$. This idea is formalized in Lemmas
22 and 23.


M Figure 6 A sketch of G, where t' = 2,m = 6 ,n = b and k = 2. Thereby K should be 5 and 
$W_{(c_1,c_2)}$ should contain 10 vertices. In this example we show the constructed graph when choosing
K = 1 for simplicity. We use the two colors $c_1$ and $c_2$, corresponding to white and black in the
figure. Edges from R to B are left out for simplicity.


6. Then, add $\log t'$ triangles, with vertices $t^0_\ell$, $t^1_\ell$ and $t_\ell$ for $\ell \in [\log t']$. Connect $t^0_\ell$ to all
vertices in $B_j$ if the $\ell$th bit of j equals 0, connect $t^1_\ell$ to all vertices in $B_j$ if the $\ell$th bit of
j equals 1. Define T to be the union of all these triangles. By choosing exactly one of the
vertices $t^0_\ell$ or $t^1_\ell$ in a dominating set for each $\ell$, all groups $B_j$ except one are dominated
automatically. The non-dominated one should then be part of a solvable input instance.

7. Finally, add the edges $\{\{s, t^i_\ell\} \mid \ell \in [\log t'], i \in \{0,1\}\}$. This step ensures that every vertex
in T that is contained in the dominating set has s as a neighbor in the dominating set,
which implies that there is always a minimum dominating set that is connected.
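
The seven steps above can be sketched in code as follows. This is an illustration under several assumptions that are not taken from the paper: all indices are 0-based, the vertex names are hypothetical tuples, the third vertex of each triangle is called "apex", and the identifiers are simply the first t' size-K subsets in lexicographic order (the construction only requires them to be distinct).

```python
from itertools import combinations


def build_domset_graph(instances, tp, k, per_color, n):
    """Sketch of Steps 1-7 for t' = tp (a power of two), per_color = m/k.

    instances[(i, j)] is a set of pairs ((p, q), l): in X_{i,j}, vertex q of
    color p is adjacent to blue vertex l. Returns (adjacency dict, K).
    """
    logt = tp.bit_length() - 1
    K = 2 + k + logt
    adj = {}

    def add_edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Step 1 (identifiers): distinct K-subsets of range(2K), one per group R_i.
    ids = dict(zip(range(tp), combinations(range(2 * K), K)))

    # Steps 1-3: overlay the inputs so that R_i and B_j induce G_{i,j}.
    reds = [("r", i, p, q) for i in range(tp) for p in range(k) for q in range(per_color)]
    for r in reds:
        adj.setdefault(r, set())
    for j in range(tp):
        for l in range(n):
            adj.setdefault(("b", j, l), set())
    for (i, j), edges in instances.items():
        for (p, q), l in edges:
            add_edge(("r", i, p, q), ("b", j, l))

    # Step 4: s sees all of R; the pendant vertex s' forces s or s'.
    add_edge("s'", "s")
    for r in reds:
        add_edge("s", r)

    # Step 5: selector vertices w^x_(c1,c2) for ordered pairs of distinct colors.
    for c1 in range(k):
        for c2 in range(k):
            if c1 == c2:
                continue
            for x in range(2 * K):
                for i in range(tp):
                    color = c1 if x in ids[i] else c2
                    for q in range(per_color):
                        add_edge(("w", c1, c2, x), ("r", i, color, q))

    # Steps 6-7: one triangle per bit; t^bit_l sees B_j iff bit l of j equals bit.
    for l in range(logt):
        add_edge(("t", l, 0), ("t", l, 1))
        add_edge(("t", l, 0), ("t", l, "apex"))
        add_edge(("t", l, 1), ("t", l, "apex"))
        for bit in (0, 1):
            add_edge("s", ("t", l, bit))
            for j in range(tp):
                if (j >> l) & 1 == bit:
                    for b in range(n):
                        add_edge(("t", l, bit), ("b", j, b))
    return adj, K


def is_dominating(adj, chosen):
    """True if every vertex is in the set or has a neighbor in it."""
    return all(v in chosen or adj[v] & chosen for v in adj)
```

For a solvable input $X_{i^*,j^*}$, taking s, the solution vertices inside $R_{i^*}$, and for each bit position the triangle vertex whose superscript is the complement of that bit of $j^*$ gives a dominating set of size $k + 1 + \log t'$, which can be checked with `is_dominating` on small examples.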

We now make the following observations. 

► Lemma 20. If G has a dominating set D, then it also has a dominating set D' of size at
most |D| that does not contain any vertices from B.

Proof. Suppose we are given a minimum dominating set D of G, where vertex $v \in B$ is
present. In any dominating set, s or s' must be present. If s' is present and s is not, we
replace s' by vertex s, and still obtain a valid dominating set of the same size. As such, all
vertices in R are now dominated by s. Vertices $t^0_\ell$ and $t^1_\ell$ with $\ell \in [\log t']$ are dominated by s.
Since $t_\ell$ only has neighbors $t^0_\ell$ and $t^1_\ell$, at least one of these three vertices is present in D for
every $\ell \in [\log t']$; hence every vertex in T is in D or has a neighbor in D.

Since B is an independent set in G, the vertex v does not dominate other vertices in B.
Since the polynomial equivalence relation ensures that there are no isolated vertices in B,
vertex v has at least one neighbor u in R. We can safely replace v by u to obtain a valid
dominating set that has the same size as D. Repeating this for every vertex of B in the set
yields a dominating set that does not contain any vertices from B. ◄

► Lemma 21. Any dominating set of G of size at most $k + 1 + \log t'$ contains at least $1 + \log t'$
vertices from $\{s, s'\} \cup \{t^0_\ell, t^1_\ell, t_\ell \mid \ell \in [\log t']\}$ and thus contains at most k vertices from R.

Proof. In a dominating set D of G, at least $\log t'$ vertices are needed from T, since $t_\ell$ only has
neighbors $t^0_\ell$ and $t^1_\ell$, so one of these three vertices must be in D for each $\ell \in [\log t']$. Furthermore
at least one of the vertices s' or s must be present, therefore there are at least $1 + \log t'$ vertices in
the set that are not from R. ◄

► Lemma 22. Any dominating set of G of size at most k + 1 + log t' uses exactly one vertex 
of each color from R. 

Proof. Suppose a dominating set of G of size at most $k + 1 + \log t'$ uses fewer than k colors
from R. If at most $k - 2$ colors are used, there must be two colors $c_1$ and $c_2$ that are not
present in the set. However, this implies that all 2K vertices in $W_{(c_1,c_2)}$ are not dominated
by vertices in R and must therefore be in the set. This contradicts the maximum size of
the dominating set, since $K = k + 2 + \log t'$. So, we are left with the possibility of using
$k - 1$ colors. Consider some color $c_1$ that was not used. Look at another color $c_2$ that is
used exactly once; such a color exists by Lemma 21. Suppose the vertex of color $c_2$ in the
dominating set was from set $R_i$ for some $i \in [t']$. Then for any $x \in \text{ID}(R_i)$ we have that
$w^x_{(c_1,c_2)}$ is not connected to any vertex in the dominating set and therefore must be in the
dominating set itself. Since $\text{ID}(R_i)$ contains K numbers, there are K vertices that are not
dominated by R, which contradicts the maximum size of the dominating set. ◄

► Lemma 23. For any dominating set D of G of size at most $k + 1 + \log t'$, there exists
$i \in [t']$ such that all vertices in $D \cap R$ are contained in set $R_i$.

Proof. Suppose there exist two vertices $u, v \in D$ such that $u \in R_i$ and $v \in R_j$ for some
$i \neq j$. By Lemma 22, u and v have different colors. Suppose u has color $c_u$ and v has color
$c_v$. Since $R_i \neq R_j$, there exists $x \in [2K]$ such that $x \in \text{ID}(R_i)$ and $x \notin \text{ID}(R_j)$. By Step 5 of
the construction, this means that none of the neighbors of vertex $w^x_{(c_v,c_u)}$ are contained in
the dominating set. However, this vertex is not in D and therefore D is not a dominating set
of G, which is a contradiction. ◄

Using the previous lemmas, we obtain:

► Lemma 24. 1. If there is an input $X_{i^*,j^*}$ that has a col-RBDS of size k, then G has a
connected dominating set of size $k + 1 + \log t'$.

2. If G has a (not necessarily connected) dominating set of size $k + 1 + \log t'$, then some
input $X_{i^*,j^*}$ has a col-RBDS of size k.

Proof. (1) Let $X_{i^*,j^*}$ have a colored RBDS D of size at most k; then we can construct a
dominating set D' of G in the following way. For any vertex $r^*_{p,q}(i^*,j^*)$ in D, add vertex $r^{i^*}_{p,q}$ to D'.
Furthermore add the vertex s to D'. Then, for each $\ell \in [\log t']$, add vertex $t^0_\ell$ to D' if the $\ell$'th bit
of $j^*$ is 1, and add vertex $t^1_\ell$ otherwise. Now s' is dominated and all vertices in R have neighbor s in D'.
All vertices in $B_{j^*}$ are covered by the vertices in the dominating set from $R_{i^*}$, since D was
a col-RBDS of $X_{i^*,j^*}$. All vertices in $B_j$ for $j \neq j^*$ have neighbor $t^0_\ell$ or $t^1_\ell$ in D' for some
$\ell \in [\log t']$, since the bit representation of j must differ from the one of $j^*$ at some position. It
now follows from Step 6 of the construction that all vertices in $B_j$ are connected to a vertex
in the dominating set. It remains to verify that all vertices in W have a neighbor in D'.
Consider $w^x_{(c_1,c_2)}$ for $x \in [2K]$ and $c_1, c_2 \in [k]$. If $x \in \text{ID}(R_{i^*})$, then this vertex is connected
to all vertices of color $c_1$ in $R_{i^*}$ and exactly one of them is contained in D'. If $x \notin \text{ID}(R_{i^*})$, the
vertex is connected to all vertices of color $c_2$ in $R_{i^*}$ and again one vertex of this
color in $R_{i^*}$ is contained in D'. So D' is a dominating set of G and it is easy to verify that
$|D'| = k + 1 + \log t'$. Furthermore, D' is constructed in such a way that it is connected. We
can show this by proving that every vertex in D' is a neighbor of s, since we chose s in D'.
Vertices in $D' \cap R$ and $D' \cap T$ are neighbors of s, by Steps 4 and 7 of the construction of G.
The vertex s' and vertices from W and B are not contained in D'. Thus, D' is a connected
dominating set.

(2) Let D' be a dominating set of G of size at most $k + 1 + \log t'$. Using Lemma 20 we
modify D' such that it chooses no vertices from B, without increasing its size. By Lemmas
22 and 23, D' contains exactly k vertices from R, all from the same $R_{i^*}$ for some $i^*$ and all
of different color. D' has size at most $k + 1 + \log t'$ of which k are contained in R and one in
{s, s'}. Combined with the fact that for any $\ell \in [\log t']$ vertex $t_\ell$ has $t^0_\ell$ and $t^1_\ell$ as its only
two neighbors, it follows that exactly one of these three vertices is contained in D' for all $\ell$.
Therefore D' contains at most one of the vertices $t^0_\ell$ or $t^1_\ell$ for every $\ell \in [\log t']$.

We can now define $x_\ell \in \{0,1\}$ for $\ell \in [\log t']$, such that $t^{x_\ell}_\ell \notin D'$ for all $\ell \in [\log t']$.
Consider the index $j^* \in [t']$ given by the binary representation $[x_1 x_2 \ldots x_{\log t'}]_2$. It follows
from the bit representation of $j^*$ that the vertices in $B_{j^*}$ are not connected to any of the
vertices in $D' \cap T$. Since vertices in $B_{j^*}$ are only adjacent to vertices in R and vertices of T,
it follows that every vertex in $B_{j^*}$ has a neighbor in R that is in D'. This implies that every
vertex in $B_{j^*}$ has a neighbor in $D' \cap R_{i^*}$. Since $G[R_{i^*} \cup B_{j^*}]$ is isomorphic to the graph of
instance $X_{i^*,j^*}$, it follows that $X_{i^*,j^*}$ has a col-RBDS of size at most k, which consists of exactly
the vertices in $D' \cap R_{i^*}$. ◄


Given t instances, the graph G constructed above has $n \cdot t' + m \cdot t' + 2 + 3 \cdot \log t' + 2\binom{k}{2} \cdot 2K =
O(\sqrt{t} \cdot \max|X_{i,j}|^{O(1)})$ vertices. It is straightforward to construct G in polynomial time. It follows
from Lemma 24 that G has a dominating set of size $k + 1 + \log t'$ if and only if one of the
input instances has a col-RBDS of size k. Furthermore, G has a connected dominating set of
size $k + 1 + \log t'$ if and only if one of the input instances has a col-RBDS of size k. Therefore
we have given a degree-2 cross-composition to (Connected) Dominating Set. Using
Theorem 3 it follows that Dominating Set and Connected Dominating Set do not
have a generalized kernel of size $O(n^{2-\varepsilon})$ for any $\varepsilon > 0$, unless NP ⊆ coNP/poly. ◄


Just as the sparsification lower bounds for Vertex Cover that were presented by Dell
and van Melkebeek [5] had implications for the parameterization by the solution size k,
Theorem 19 has implications for the kernelization complexity of k-Nonblocker and k-Max
Leaf. Since the solution size k never exceeds the number of vertices in this problem, a kernel
with $O(k^{2-\varepsilon})$ edges would give a nontrivial sparsification, contradicting Theorem 19. Hence
our results show that the existing linear-vertex kernels for k-Nonblocker [5] and k-Max
Leaf m cannot be improved to $O(k^{2-\varepsilon})$ edges unless NP ⊆ coNP/poly.


6 d-Hypergraph 2-Colorability and d-NAE-SAT


The goal of this section is to give a nontrivial sparsification algorithm for NAE-SAT and prove 
a matching lower bound. For ease of presentation, we start by analyzing the closely related 
hypergraph 2-colorability problem. Recall that a hypergraph consists of a vertex set V and a 
set E of hyperedges; each hyperedge $e \in E$ is a subset of V. A 2-coloring of a hypergraph is
a function $c\colon V \to \{1, 2\}$; such a coloring is proper if there is no hyperedge whose vertices all
obtain the same color. We will use d-Hypergraph 2-Colorability to refer to the setting
where hyperedges have size at most d. The corresponding decision problem asks, given a 
hypergraph, whether it is 2-colorable. 

► Theorem 25. d-Hypergraph 2-Colorability parameterized by the number of vertices n
has a kernel with $2 \cdot n^{d-1}$ hyperedges that can be encoded in $O(n^{d-1} \cdot d \cdot \log n)$ bits.


Proof. Suppose we are given a hypergraph with vertex set V and hyperedges E, where each
hyperedge contains at most d vertices. We show how to reduce the number of hyperedges
without changing the 2-colorability status. Let $E_r \subseteq E$ denote the set of edges in E that
contain exactly r vertices. For each $E_r$ we construct a set $E'_r \subseteq E_r$ of representative
hyperedges. Enumerate the edges in $E_r$ as $e_1, \ldots, e_k$. We construct a (0,1)-matrix $M_r$ with
$N := \binom{n}{r-1}$ rows and k columns. Consider all possible subsets $A_1, \ldots, A_N$ of size $r - 1$ of
the set of vertices V. Define the elements $m_{i,j}$ for $i \in [N]$ and $j \in [k]$ of $M_r$ as follows.

\[ m_{i,j} := \begin{cases} 1 & \text{if } A_i \subseteq e_j; \\ 0 & \text{otherwise.} \end{cases} \]

Using Gaussian elimination, compute a basis B of the columns of this matrix, which is a
subset of the columns that span the column space of $M_r$. Let $E'_r$ contain edge $e_i$ if the i'th
column of $M_r$ is contained in B, and define $E' := \bigcup_{r \in [d]} E'_r$, which forms the kernel. Using
a lemma due to Lovász [20], we can prove that E' preserves the 2-colorability status.

► Lemma 26 ([20]). Let H be an r-uniform hypergraph with edges $E_1, \ldots, E_m$. Let
$\alpha_1, \ldots, \alpha_m$ be real numbers such that for every $(r - 1)$-element subset A of V(H),

\[ \sum_{E_i \supseteq A} \alpha_i = 0. \]

Then for every partition $\{V_1, V_2\}$ of V(H) the following holds:

\[ \sum_{E_i \subseteq V_1} \alpha_i = (-1)^r \sum_{E_i \subseteq V_2} \alpha_i. \]

Now we can prove the correctness of the presented kernel. 

► Lemma 27. (V, E) has a proper 2-coloring $\Leftrightarrow$ (V, E') has a proper 2-coloring.

Proof. ($\Rightarrow$) Clearly, if (V, E) has a proper 2-coloring, then the same coloring is proper for
the subhypergraph (V, E') since $E' \subseteq E$.

($\Leftarrow$) Now suppose (V, E') has a proper 2-coloring. We will show that for each $r \in [d]$,
no edge of $E_r$ is monochromatic under this coloring. All hyperedges contained in E' are
2-colored by definition. Suppose there exists $r \in [d]$ such that $E_r$ contains a monochromatic
hyperedge. Let $E_r = \{e_1, \ldots, e_k\}$ and let $e_{i^*}$ be a hyperedge in $E_r$ whose vertices all receive
the same color.

By reordering the columns of the matrix $M_r$, we may assume that the basis B of $M_r$ contains the first $\ell$
columns, thus $i^* > \ell$. Let $m_i$ denote the i'th column of $M_r$. Since $m_{i^*}$ is not contained in
the basis, there exist coefficients $\alpha_1, \ldots, \alpha_\ell$ such that

\[ \sum_{i=1}^{\ell} \alpha_i \cdot m_i = m_{i^*}. \]

For $i \in [k]$, define:

\[ \beta_i := \begin{cases} \alpha_i & \text{if } i \le \ell; \\ -1 & \text{if } i = i^*; \\ 0 & \text{otherwise.} \end{cases} \]

From this definition of $\beta$ it follows that

\[ \sum_{i=1}^{k} \beta_i \cdot m_i = \sum_{i=1}^{\ell} \alpha_i \cdot m_i - m_{i^*} = 0. \]

Let $A_j$ be any size-$(r - 1)$ subset of V. Since $m_{j,i} = 1$ exactly when $e_i \supseteq A_j$, and 0 otherwise,
we have:

\[ \sum_{e_i \supseteq A_j} \beta_i = \sum_{i=1}^{k} \beta_i \cdot m_{j,i} = 0. \]

By Lemma 26 we obtain that for any partitioning $V_1 \cup V_2$ of the vertices in V,

\[ \sum_{e_i \subseteq V_1} \beta_i = (-1)^r \sum_{e_i \subseteq V_2} \beta_i. \qquad (1) \]

Consider however the partitioning $(V_1, V_2)$ given by the 2-coloring of the vertices. Then every
edge $e_i \in E'_r$ contains at least one vertex of each color and is thereby not fully contained in
$V_1$ or $V_2$. As such, these edges contribute 0 to both sides of the equation. The edge $e_{i^*}$ is the
only remaining edge with a non-zero coefficient and by assumption, it is contained entirely
within one color class. Without loss of generality, let $e_{i^*} \subseteq V_1$. But then $\sum_{e_i \subseteq V_1} \beta_i = -1$
while $\sum_{e_i \subseteq V_2} \beta_i = 0$, which contradicts (1). ◄

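To bound the size of the kernel, consider the matrix $M_r$ for $r \in [d]$. Its rank is bounded
by the minimum of its number of rows and columns, which is at most $\binom{n}{r-1} \le n^{r-1}$. As
such, we get $|E'_r| \le \text{rank}(M_r) \le n^{r-1}$. Note that $d \le n$, such that $|E'| \le \sum_{r=1}^{d} n^{r-1} =
n^{d-1} + \sum_{r=1}^{d-1} n^{r-1} \le n^{d-1} + (d-1) \cdot n^{d-2} \le 2 \cdot n^{d-1}$. So E' contains at most $2 \cdot n^{d-1}$
hyperedges. Since a hyperedge
consists of at most d vertices, the kernel can be encoded in $O(n^{d-1} \cdot d \cdot \log n)$ bits. ◄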
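
The kernelization in the proof of Theorem 25 can be sketched as follows. This is a minimal illustration, assuming exact rational arithmetic via `fractions.Fraction` and non-empty hyperedges; the function names are hypothetical, and the brute-force colorability check is only there for testing tiny inputs.

```python
from fractions import Fraction
from itertools import combinations, product


def independent_columns(rows, num_cols):
    """Indices of the pivot columns after exact Gaussian elimination.

    The pivot columns of a matrix form a basis of its column space.
    """
    mat = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(num_cols):
        pivot = next((i for i in range(r, len(mat)) if mat[i][c] != 0), None)
        if pivot is None:
            continue
        mat[r], mat[pivot] = mat[pivot], mat[r]
        for i in range(len(mat)):
            if i != r and mat[i][c] != 0:
                factor = mat[i][c] / mat[r][c]
                mat[i] = [a - factor * b for a, b in zip(mat[i], mat[r])]
        pivots.append(c)
        r += 1
    return pivots


def sparsify(vertices, hyperedges):
    """For every edge size r, keep the edges whose columns form a basis of M_r."""
    by_size = {}
    for e in {frozenset(e) for e in hyperedges}:
        by_size.setdefault(len(e), []).append(e)
    kept = []
    for r, edges in sorted(by_size.items()):
        subsets = [frozenset(a) for a in combinations(sorted(vertices), r - 1)]
        # Row i, column j holds 1 exactly when subset A_i is contained in edge e_j.
        rows = [[1 if a <= e else 0 for e in edges] for a in subsets]
        kept += [edges[c] for c in independent_columns(rows, len(edges))]
    return kept


def two_colorable(vertices, hyperedges):
    """Brute-force check, exponential in the number of vertices."""
    vs = sorted(vertices)
    for colors in product((1, 2), repeat=len(vs)):
        col = dict(zip(vs, colors))
        if all(len({col[v] for v in e}) == 2 for e in hyperedges):
            return True
    return False
```

As a usage example, the complete 3-uniform hypergraph on 6 vertices has 20 hyperedges, while $M_3$ has only $\binom{6}{2} = 15$ rows, so `sparsify` keeps at most 15 of them and the result is still not 2-colorable.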

By a folklore reduction, Theorem 25 gives a sparsification for NAE-SAT. Consider
an instance of d-NAE-SAT, which is a conjunction of clauses of size at most d over variables
$x_1, \ldots, x_n$. The formula gives rise to a hypergraph on vertex set $\{x_i, \neg x_i \mid i \in [n]\}$
containing one hyperedge per clause, whose vertices correspond to the literals in the clause.
When additionally adding n hyperedges $\{x_i, \neg x_i\}$ for $i \in [n]$, it is easy to see that the resulting
hypergraph is 2-colorable if and only if there is a NAE-satisfying assignment to the formula.
The maximum size of a hyperedge matches the maximum size of a clause and the number of
created vertices is twice the number of variables. We can therefore sparsify an n-variable
instance of d-NAE-SAT in the following way: reduce it to a d-hypergraph with $n' := 2n$
vertices and apply the kernelization algorithm of Theorem 25. It is easy to verify that
restricting the formula to the representative hyperedges in the kernel gives an equisatisfiable
formula containing $2 \cdot (n')^{d-1} \in O(n^{d-1})$ clauses, giving a sparsification for NAE-SAT.
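
The folklore reduction described above can be sketched as follows. The encoding of literals as signed integers (v for $x_v$, -v for its negation, as in the DIMACS convention) and all function names are assumptions made for this illustration; both checkers are brute force and only suitable for tiny formulas.

```python
from itertools import product


def nae_to_hypergraph(num_vars, clauses):
    """One vertex per literal, one hyperedge per clause, plus the edges {x_v, not x_v}."""
    vertices = [lit for v in range(1, num_vars + 1) for lit in (v, -v)]
    edges = [frozenset(c) for c in clauses]
    edges += [frozenset((v, -v)) for v in range(1, num_vars + 1)]
    return vertices, edges


def nae_satisfiable(num_vars, clauses):
    """Brute force: every clause needs at least one true and one false literal."""
    for bits in product((False, True), repeat=num_vars):
        values = [{bits[abs(lit) - 1] == (lit > 0) for lit in c} for c in clauses]
        if all(len(v) == 2 for v in values):
            return True
    return False


def two_colorable(vertices, hyperedges):
    """Brute force: no hyperedge may be monochromatic."""
    vs = sorted(vertices)
    for colors in product((1, 2), repeat=len(vs)):
        col = dict(zip(vs, colors))
        if all(len({col[v] for v in e}) == 2 for e in hyperedges):
            return True
    return False
```

The two color classes play the roles of true and false literals; the extra hyperedges $\{x_v, \neg x_v\}$ force a variable and its negation into different classes, so a proper 2-coloring can be read as an assignment.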

As mentioned in the introduction, the existence of a linear-parameter transformation m 
from d-CNF-SAT to {d + 1)-NAE-SAT also implies a sparsification lower bound for d-NAE-SAT, 
using the results of Dell and van Melkebeek [5]. Hence we obtain the following theorem. 

► Theorem 28. For every fixed $d \ge 4$, the d-NAE-SAT problem parameterized by the number
of variables n has a kernel with $O(n^{d-1})$ clauses that can be encoded in $O(n^{d-1} \cdot \log n)$ bits,
but admits no generalized kernel of size $O(n^{d-1-\varepsilon})$ for $\varepsilon > 0$ unless NP ⊆ coNP/poly.

7 Conclusion

We have added several classic graph problems to a growing list of problems for which 
non-trivial polynomial-time sparsification is provably impossible under the assumption that 
NP ⊄ coNP/poly. Our results for (Connected) Dominating Set proved that the linear-vertex
kernels with $O(k^2)$ edges for k-Nonblocker and k-Max Leaf Spanning Tree
cannot be improved to $O(k^{2-\varepsilon})$ edges unless NP ⊆ coNP/poly.


The graph problems for which we proved sparsification lower bounds can be defined 
in terms of vertices: the 4-Coloring problem asks for a partition of the vertex set into 
four independent sets, Dominating Set asks for a dominating subset of vertices, and 
Hamiltonian Cycle asks for a permutation of the vertices that forms a cycle. In contrast, 
not much is known concerning sparsification lower bounds for problems whose solution is 
an edge subset of possibly quadratic size. For example, no sparsification lower bounds are 
known for well-studied problems such as Max Cut, Cluster Editing, or Feedback 
Arc Set in Tournaments. Difficulties arise when attempting to mimic our lower bound
constructions for such edge-based problems. Our constructions all embed t instances into
a $2 \times \sqrt{t}$ table, using each combination of a cell in the top row and bottom row to embed
one input. For problems defined in terms of edge subsets, it becomes difficult to “turn off”
the contribution of edges that are incident on vertices that do not belong to the two cells
that correspond to a yes-instance among the inputs to the OR-construction. This could be
interpreted as evidence that edge-based problems such as Max Cut might admit non-trivial 
polynomial sparsification. We have not been able to answer this question in either direction, 
and leave it as an open problem. For completeness, we point out that Karp’s reduction m 
from Vertex Cover to Feedback Arc Set (which only doubles the number of vertices) 
implies, using existing bounds for Vertex Cover [5], that Feedback Arc Set does not 
have a compression of size $O(n^{2-\varepsilon})$ unless NP ⊆ coNP/poly.

Another problem whose compression remains elusive is 3-Coloring. In several settings 
(cf. [12]), the optimal kernel size matches the size of minimal obstructions in a problem-specific
partial order. This is the case for d-NAE-SAT, whose kernel with $O(n^{d-1})$ clauses matches
the fact that critically 3-chromatic d-uniform hypergraphs have at most $O(n^{d-1})$ hyperedges.
Following this line of reasoning, it is tempting to conjecture that 3-Coloring does not admit 
subquadratic compressions: there are critically d-chromatic graphs with 0(n^) edges [^ . 

The kernel we have given for d-NAE-SAT is one of the first examples of non-trivial 
polynomial-time sparsification for general structures that are not planar or similarly guaranteed
to be sparse. Obtaining non-trivial sparsification algorithms for other problems is an
interesting challenge for future work. Are there natural problems defined on general graphs 
that admit subquadratic sparsification? 
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